Propositionalisation and Aggregates

Authors

  • Arno J. Knobbe
  • Marc de Haas
  • Arno Siebes
Abstract

In the practice of data mining, the fact that data is scattered over many tables causes many problems. To deal with this, one either constructs a single table by hand or uses a Multi-Relational Data Mining algorithm. In this paper, we propose a different approach in which the single table is constructed automatically using aggregate functions, which repeatedly summarise information from different tables over associations in the data model. Once the single table has been constructed, we apply traditional data mining algorithms. In addition to an in-depth discussion of our approach, the paper presents results of experiments on three well-known data sets.
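
The idea of building the single table with aggregate functions can be illustrated with a minimal sketch. It is not the authors' implementation: the tables (`customers`, `orders`), the helper `propositionalise`, and the particular aggregates (count, sum, min, max, average) are assumptions chosen for illustration, and only one one-to-many association is flattened, whereas the abstract describes applying the summarisation repeatedly across the data model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical target table: one row per customer (the unit of analysis).
customers = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
    {"customer_id": 3, "age": 27},
]

# Hypothetical secondary table with a one-to-many association to customers.
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 10.0},
]


def propositionalise(targets, details, key, column):
    """Flatten a one-to-many association into aggregate columns on the target rows."""
    # Group the detail values by the foreign key that points at the target table.
    groups = defaultdict(list)
    for row in details:
        groups[row[key]].append(row[column])

    flat = []
    for target in targets:
        values = groups.get(target[key], [])
        flat.append({
            **target,
            f"count_{column}": len(values),
            f"sum_{column}": sum(values),
            # min, max and mean are undefined for an empty group, so use None.
            f"min_{column}": min(values) if values else None,
            f"max_{column}": max(values) if values else None,
            f"avg_{column}": mean(values) if values else None,
        })
    return flat


table = propositionalise(customers, orders, "customer_id", "amount")
for row in table:
    print(row)
```

Each row of `table` is a plain attribute-value record, so it could be passed to a conventional single-table learner. Customers without orders keep a count of zero and receive `None` for the undefined aggregates; how such missing values should be handled is left open here.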

Similar resources

Good and Bad Practices in Propositionalisation

Data is mainly available in relational formats, so relational data mining receives a lot of interest. Propositionalisation consists of changing the representation of relational data so that the usual attribute-value learning systems can be applied. Data mining practitioners are not necessarily aware of existing work and try to propositionalise by hand. Unfortunately, there exist some tempting pitfalls....

Reduction of ILP Search Space with Bottom-Up Propositionalisation

This paper introduces a method for algorithmic reduction of the search space of an ILP task, omitting the need for explicit language bias. It relies on bottom-up propositionalisation of examples and background knowledge. A proof of concept has been developed for observational learning of stratified normal logic programs.

Propositionalisation of Profile Hidden Markov Models for Biological Sequence Analysis

Hidden Markov Models are a widely used generative model for analysing sequence data. Profile Hidden Markov Models (PHMMs) are a variant used in Bioinformatics to represent, for example, protein families. In this paper we introduce a simple propositionalisation method for Profile Hidden Markov Models. The method allows the use of PHMMs discriminatively in a classification task. Previousl...

Approaching the ILP 2005 Challenge: Class-Conditional Bayesian Propositionalization for Genetic Classification

This report presents a statistical propositionalisation approach to relational classification and probability estimation on the genetic ILP Challenge domain. The main difference between our approach and existing propositionalisation approaches is its ability to construct features from categorical attributes with many possible values, in particular object identifiers. Our classification and rankin...

Lazy Propositionalisation for Relational Learning

A number of Inductive Logic Programming (ILP) systems have addressed the problem of learning First Order Logic (FOL) discriminant definitions by first reformulating the FOL learning problem into an attribute-value one and then applying efficient learning techniques dedicated to this simpler formalism. The complexity of such propositionalisation methods is now in the size of the reformulated pro...

Publication date: 2001